Proteomic insight into arabinogalactan utilization by particle-associated Maribacter sp. MAR_2009_72

Abstract Arabinose and galactose are major, rapidly metabolized components of marine particulate and dissolved organic matter. In this study, we observed for the first time large microbiomes for the degradation of arabinogalactan and report a detailed investigation of arabinogalactan utilization by the flavobacterium Maribacter sp. MAR_2009_72. Cellular extracts hydrolysed arabinogalactan in vitro. Comparative proteomic analyses of cells grown on arabinogalactan, arabinose, galactose, and glucose revealed the expression of specific proteins in the presence of arabinogalactan, mainly glycoside hydrolases (GH). Extracellular glycan hydrolysis involved five alpha-l-arabinofuranosidases affiliating with glycoside hydrolase families 43 and 51, four unsaturated rhamnogalacturonylhydrolases (GH105) and a protein with a glycoside hydrolase family-like domain. We detected expression of three induced TonB-dependent SusC/D transporter systems, one SusC, and nine glycoside hydrolases with a predicted periplasmatic location. These are affiliated with the families GH3, GH10, GH29, GH31, GH67, GH78, and GH115. The genes are located outside of and within canonical polysaccharide utilization loci classified as specific for arabinogalactan, for galactose-containing glycans, and for arabinose-containing glycans. The breadth of enzymatic functions expressed in Maribacter sp. MAR_2009_72 as response to arabinogalactan from the terrestrial plant larch suggests that Flavobacteriia are main catalysts of the rapid turnover of arabinogalactans in the marine environment.


Introduction
Marine environments contain many different polysaccharides as dissolved organic matter (DOM) or in particulate organic matter (POM). These are a vital carbon source for microorganisms and are released from algae as exudates or during lysis by zooplankton predation or viral infection. As early as 1982, monosaccharide analysis of planktonic biomass from the North Sea revealed a dominance of glucose, followed by arabinose, galactose, and mannose (Ittekkot et al. 1982, Urbani et al. 2005, Alderkamp et al. 2007, Scholz and Liebezeit 2013, Huang et al. 2021). These monomers are the building blocks of algal polysaccharides: the abundant beta-homoglycans laminarin, cellulose, and xylan are often complemented with species-specific glycans such as agar, alginate, carrageenan, fucoidan, mannan, pectin, porphyran, and ulvan. The degradation of these glycans has been studied intensively in marine systems; however, details for arabinogalactan are missing (Bäumgen et al. 2021). Recently, arabinogalactan was detected in the high molecular weight dissolved organic matter (HMWDOM) and POM fractions using monoclonal antibodies during the algal spring bloom in the North Sea (Vidal-Melgosa et al. 2021). This coincides with the high arabinose and galactose content of Phaeocystis spp., a haptophyte blooming in the North Sea (Alderkamp et al. 2007, Sato et al. 2018). The antibody-based quantification also showed a decrease in arabinogalactan content towards the end of the spring bloom, suggesting a fast turnover of the compound, in contrast to the accumulation of fucose-containing sulfated polysaccharides (Vidal-Melgosa et al. 2021). The major source of arabinose and galactose in algae is likely arabinogalactan proteins, which anchor polysaccharide cell walls in the outer membrane of plants and algae (Silva et al. 2020, Leszczuk et al. 2023).
The model compound for arabinogalactan type II is arabinogalactan from larch wood. It contains d-galactose and l-arabinose in a 6:1 molar ratio as well as traces of rhamnose, fucose, mannose, xylose, and d-glucuronic acid (Fujita et al. 2019, Villa-Rivera et al. 2021, Leszczuk et al. 2023). Type II arabinogalactans have a complex structure consisting of a β-1,3-linked galactan backbone with β-1,6-linked galactan side chains (Kelly 1999, Wang and LaPointe 2020). Type I has a β-1,4-linked galactan backbone, whereby C3 can be linked with l-arabinofuranose (Hinz et al. 2005).
Plant arabinogalactan is degraded by aerobic bacteria and fungi as well as by anaerobic fermenting bacteria in gut systems, including Bifidobacterium and Bacteroidetes (Shulami et al. 2011, Ndeh et al. 2017, Cartmell et al. 2018, Luis et al. 2018, Wang and LaPointe 2020, Sasaki et al. 2021). The latter phylum also encompasses aerobic Flavobacteriia that have been identified as specialists for polysaccharide degradation in marine systems (Sidhu et al. 2023). For this first study on the degradation of arabinogalactan by marine microorganisms, we selected a flavobacterial strain with a published genome and a particle-associated lifestyle, Maribacter sp. MAR_2009_72 (Kappelmann et al. 2018, Heins et al. 2021a). Strains of the genus Maribacter are rarely isolated from seawater, but they are more abundant in particle fractions (Nedashkovskaya et al. 2004, Heins and Harder 2023, Lu et al. 2023, Sidhu et al. 2023). Abundances of up to 4% were detected in the oxic surface layer of sandy sediments (Probandt et al. 2018, Miksch et al. 2021). Even higher abundances were observed in micro- and macroalgae phycosphere populations (Heins et al. 2021b, Lu et al. 2023). This makes Maribacter strains ideal candidates for studying the degradation of algal cell wall polysaccharides.
The uptake and degradation of polysaccharides in Bacteroidetes is often encoded in polysaccharide utilization loci (PULs). The first PUL was described for Bacteroides thetaiotaomicron for starch utilization (Shipman et al. 2000). Polysaccharide utilization starts with the extracellular hydrolysis of polysaccharides into oligosaccharides on the surface of the cell. The oligosaccharides are transported into the periplasm via the SusC/D transport system, which is energized by a proton gradient via an ExbB/D-TonB system in the cytoplasmic membrane and by a domain in the periplasm to open the β-barrel channel of SusC for the transport (Noinaj et al. 2010). The hydrolysis of polysaccharides is achieved by glycoside hydrolases (GH), glycoside transferases, polysaccharide lyases, and carbohydrate esterases with a high specificity, sometimes assisted by carbohydrate binding modules. These five groups of proteins are classified as carbohydrate active enzymes (CAZymes) (Bäumgen et al. 2021, Drula et al. 2022). For the degradation of arabinogalactan from larch wood, PULs have so far been characterized for gut bacteria including Bifidobacterium longum ssp. longum NCC2705, Bacteroides caccae ATCC 43185, and Bacteroides thetaiotaomicron (Ndeh et al. 2017, Cartmell et al. 2018, Luis et al. 2018, Wang and LaPointe 2020). Here, we analyzed Maribacter sp. MAR_2009_72 proteomes using cells grown on arabinogalactan, arabinose, galactose, and glucose. Those proteomes were compared to identify the proteins induced by arabinogalactan. This study expands a recent in silico study that did not report on arabinogalactan-specific PULs (Kappelmann et al. 2018) and provides experimental observations for a better interpretation of marine metagenomes.

Growth experiments
Maribacter sp. MAR_2009_72 (DSM 29384), originally isolated from a phytoplankton catch in the Wadden Sea near the island Sylt, Germany, was revived from glycerol stocks that had been preserved in the laboratory since the initial isolation (Hahnke and Harder 2013). The strain was grown in the liquid medium HaHa_100V with 0.3 g/l of casamino acids as the sole carbon source (Hahnke et al. 2015). This limited growth to an optical density (OD) at 600 nm below 0.2. Growth beyond an OD of 0.3 was achieved by adding 2 g/l of a carbohydrate source, here arabinose, galactose, glucose (Sigma Aldrich/Merck KGaA, Darmstadt, Germany), and larch arabinogalactan (The Dairy School, Auchincruive, Scotland). The supplier of arabinogalactan had specified the monosaccharide composition as 81% galactose, 14% arabinose, and 5% other, whereby the other fraction was not defined. For proteomics, three cultures of 50 ml were inoculated with 0.4% v/v of a pregrown culture in the same medium and incubated at room temperature at 110 rpm. A fourth culture per substrate was maintained to monitor bacterial growth by measuring OD at 600 nm beyond the harvest point. Cells were harvested at an OD of 0.25. Cells were pelleted by centrifugation in 50 ml tubes at 3080 × g for 30 min at 4 °C. Pellets were resuspended in 1 ml medium and centrifuged in 1.5 ml tubes at 15870 × g for 15 min at 4 °C. The wet biomass was weighed and stored at −20 °C.
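OD readings from the monitoring culture can be converted into a specific growth rate μ. The following is a minimal sketch, assuming exponential growth over the selected time window and a least-squares fit of ln(OD) against time; the function names and the readings are hypothetical illustrations, not the procedure or data of this study.

```python
import math

def specific_growth_rate(times_h, od_values):
    """Least-squares slope of ln(OD) versus time in hours, i.e. mu in 1/h."""
    logs = [math.log(od) for od in od_values]
    n = len(times_h)
    mean_t = sum(times_h) / n
    mean_y = sum(logs) / n
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_h, logs))
    sxx = sum((t - mean_t) ** 2 for t in times_h)
    return sxy / sxx

def doubling_time(mu):
    """Doubling time in hours for a specific growth rate mu (1/h)."""
    return math.log(2) / mu

# Hypothetical OD600 readings generated from an ideal exponential curve.
times = [0, 6, 12, 18, 24]
ods = [0.02 * math.exp(0.06 * t) for t in times]

mu = specific_growth_rate(times, ods)
print(f"mu = {mu:.3f} 1/h, doubling time = {doubling_time(mu):.1f} h")
```

Only readings from the exponential phase should enter the fit; lag and stationary phase points would bias the slope downwards.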
For microbiome size determinations, colony-forming units (CFU) were determined with 4 g/l larch wood arabinogalactan as organic carbon source on marine plates (Hahnke and Harder 2013), using 4 g/l glucose or ZoBell's 2216 marine agar plates as reference. Inoculation of serially diluted sea or sediment pore water was performed with a 96-pin holder. Inoculations were at room temperature. Partial 16S rRNA gene sequences of strains were obtained by colony PCR and Sanger sequencing (Hahnke and Harder 2013). Partial 16S rRNA gene sequences have been deposited at GenBank under the accession numbers PP600029 to PP600099.

Protein preparation and mass spectrometry
Proteins were extracted from cells using a bead-beating method following the protocol by Schultz et al. (2020). A pellet of wet weight ranging from 20 to 200 mg was disrupted using 0.25 ml glass beads in 500 μl of lysis buffer. The protein content was quantified using the Roti Nanoquant assay (Carl Roth, Karlsruhe, Germany). For protein purification on denaturing polyacrylamide gels (SDS-PAGE), 50 μg of protein was combined with 10 μl of 4x SDS buffer [composed of 20% glycerol, 100 mM Tris/HCl, 10% (w/v) SDS, 5% β-mercaptoethanol, 0.8% bromophenol blue, pH 6.8] and loaded onto Tris-glycine-extended precast 4%-20% gels (Bio-Rad, Neuried, Germany). Electrophoresis was conducted at 150 V for 8 min. Subsequently, the gel was fixed in a solution of 10% v/v acetic acid and 40% v/v ethanol for 30 min, stained with Brilliant Blue G250 Coomassie, and the desired protein band was excised. The proteins were extracted from the gel in one piece and then washed with a solution of 50 mM ammonium bicarbonate in 30% v/v acetonitrile. The gel pieces were dried using a SpeedVac (Eppendorf, Hamburg, Germany), and then rehydrated with 2 ng/μl trypsin (sequencing grade trypsin, Promega, USA). After a 15-min incubation at room temperature, excess liquid was removed, and the samples were digested overnight at 37 °C. Following digestion, the gel pieces were covered with water suitable for mass spectrometry (MS), and peptides were eluted using ultrasonication. The peptides were subsequently desalted using Pierce™ C18 Spin Tips (Thermo Fisher, Schwerte, Germany) in accordance with the manufacturer's guidelines. The eluted peptides were dried using a SpeedVac and stored at −20 °C. For MS analysis, the samples were thawed and reconstituted in 10 μl of Buffer A (99.9% acetonitrile + 0.1% acetic acid).
Tryptic peptides of Maribacter sp. MAR_2009_72 were analyzed using an EASY-nLC 1200 system coupled to a Q Exactive HF mass spectrometer (Thermo Fisher Scientific, Waltham, USA). Peptides were loaded onto a custom-packed analytical column containing 3 μm C18 particles (Dr. Maisch GmbH, Ammerbuch, Germany). The loading was performed using buffer A (0.1% acetic acid) at a flow rate of 2 μl/min. Peptide separation was achieved through an 85-min binary gradient, transitioning from 4% to 50% buffer B, composed of 0.1% acetic acid in acetonitrile, at a flow rate of 300 nl/min. Samples were measured in parallel mode; survey scans in the Orbitrap were recorded with a resolution of 60 000 and an m/z range of 333 to 1650. The 15 most intense peaks per scan were selected for fragmentation. Precursor ions were dynamically excluded from fragmentation for 30 s. Singly charged ions as well as ions with unknown charge state were rejected. Internal lock mass calibration was applied (lock mass 445.12003 Da).
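The precursor selection rules described above (top 15 peaks, 30 s dynamic exclusion, rejection of singly charged and unassigned ions) can be illustrated with a short sketch. It is a simplification under stated assumptions: the function name, the data layout, and the matching of precursors by m/z rounded to two decimals are hypothetical, whereas real instrument control software uses its own mass tolerance settings.

```python
def select_precursors(peaks, scan_time_s, excluded_until,
                      top_n=15, exclusion_s=30.0, mz_decimals=2):
    """Pick up to top_n precursors from one survey scan.

    peaks: list of (mz, intensity, charge); charge is None if unknown.
    excluded_until: dict mapping a rounded m/z to the time (s) until which
    that precursor stays on the dynamic exclusion list. Updated in place.
    """
    candidates = []
    for mz, intensity, charge in peaks:
        if charge is None or charge < 2:
            continue  # reject unknown and singly charged ions
        key = round(mz, mz_decimals)  # assumed matching rule, see lead-in
        if excluded_until.get(key, float("-inf")) > scan_time_s:
            continue  # still dynamically excluded
        candidates.append((intensity, mz, key))
    candidates.sort(reverse=True)  # most intense first
    selected = candidates[:top_n]
    for _, _, key in selected:
        excluded_until[key] = scan_time_s + exclusion_s
    return [mz for _, mz, _ in selected]

# Hypothetical survey scan: (m/z, intensity, charge)
scan = [(500.25, 1e6, 2), (600.30, 5e5, 1), (700.40, 8e5, None), (800.50, 2e5, 3)]
exclusion = {}
first = select_precursors(scan, 0.0, exclusion)
second = select_precursors(scan, 10.0, exclusion)
third = select_precursors(scan, 31.0, exclusion)
```

In this toy run, the two multiply charged ions are selected at 0 s, skipped at 10 s while they are excluded, and eligible again at 31 s.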
The MS files were analyzed in MaxQuant version 2.2.0.0 in the standard settings against the strain specific protein
Figure 1. Growth curve of Maribacter sp. MAR_2009_72 in presence of four different carbon sources: arabinogalactan, arabinose, galactose, and glucose. MAR_2009_72 was grown in 50 ml of modified HaHa_100V with 2 g/l of the respective carbon source at room temperature at 110 rpm. The OD was measured at 600 nm.
For the visualization of the data, the following programs and packages were used: R version 4.3.2 (R Core Team 2023), ggplot2 (Wickham 2016), gggenes (Wilkins 2023), and Proksee (Grant et al.).
The MS proteomics data have been deposited to the ProteomeXchange Consortium via the PRIDE (Perez-Riverol et al. 2022) partner repository with the dataset identifier PXD049074 and 10.6019/PXD049074.

Growth on arabinogalactan
Maribacter sp. MAR_2009_72 grew in presence of larch wood arabinogalactan to a maximum OD of 0.338 and at a maximum growth rate μ = 0.06 h⁻¹ (Fig. 1). When 2 g/l of galactose or arabinose were provided in the medium, maximum ODs of 0.419 and 0.446 were measured, with respective growth rates of 0.07 h⁻¹ and 0.06 h⁻¹. Glucose supported the largest biomass formation, with an OD of 0.526 and μ = 0.05 h⁻¹. The arabinogalactan cultures required more time to enter the exponential growth phase than the cultures with monosaccharides as substrates. The physiological

Protein expression in Maribacter sp. MAR_2009_72
The comparative proteomic analysis was based on glucose as reference against arabinose, galactose, and arabinogalactan. We identified 1874 proteins in the arabinogalactan proteome (Fig. 2A). Overall, these four conditions shared 1636 proteins. Only a small number of proteins were found to be unique to a particular growth condition. The glucose proteome had 36 unique proteins, the arabinose proteome 17 proteins, and the galactose proteome 19 proteins. Arabinogalactan had 52 unique proteins. We used the expression data, here label-free quantification (LFQ) intensities, to visualize the difference between the four conditions in a principal component analysis (PCA) (Fig. 2B). The PCA plot indicated that the arabinogalactan proteome had the most contrasting expression pattern. The differences among the monosaccharide proteomes were less pronounced than their differences to the arabinogalactan proteome.
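The shared and condition-specific protein counts follow from simple set operations. The sketch below assumes, as stated in the caption of Figure 2, that a protein counts as detected in a condition when it is found in at least one of the three replicates; the function names and the protein identifiers P1 to P5 are hypothetical placeholders.

```python
def detected(replicates):
    """Proteins found in at least one replicate of a condition."""
    found = set()
    for replicate in replicates:
        found |= set(replicate)
    return found

def shared_and_unique(by_condition):
    """Return (proteins shared by all conditions, proteins unique to each)."""
    shared = set.intersection(*by_condition.values())
    unique = {}
    for name, proteins in by_condition.items():
        others = set().union(
            *(s for other, s in by_condition.items() if other != name)
        )
        unique[name] = proteins - others
    return shared, unique

# Hypothetical detections per replicate.
proteomes = {
    "arabinogalactan": detected([{"P1", "P2"}, {"P1", "P5"}, {"P1"}]),
    "arabinose": detected([{"P1", "P2"}, {"P1"}, {"P1", "P3"}]),
    "galactose": detected([{"P1", "P4"}, {"P1"}, {"P1"}]),
    "glucose": detected([{"P1", "P2"}, {"P1"}, {"P1"}]),
}
shared, unique = shared_and_unique(proteomes)
```

Here P1 is shared by all four conditions, while P2 is neither shared nor unique because it occurs in three of them.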
Maribacter sp. MAR_2009_72 has a genome of 4.35 Mb encoding 3635 proteins (Fig. 3). Nine PULs contain one or several SusC/D transporters and neighboring CAZymes. We labelled the PULs based on the arrangement in the genome, with PUL 1 being closest to the origin of replication (Table S1, Supporting Information). The expression values revealed a proteomic response to arabinogalactan in PULs 1, 7, and 8 and outside of PULs.
PUL 1 encodes 13 proteins, of which three out of four CAZymes and one SusC/D pair were expressed in arabinogalactan-grown cells (Fig. 4). The SusC/D pair (JM81_RS00910 and JM81_RS00905) was only expressed in the arabinogalactan proteome. Four other proteins were clearly induced by arabinogalactan, arabinose, and galactose. The α-l-arabinofuranosidase GH43_1 (JM81_RS00875) was 10-fold induced relative to the glucose proteome. A GH10, an endo-β-1,4-xylanase, showed a similar expression pattern with a 5-fold difference to glucose. The third enzyme was a GH67, an α-glucuronidase, which had the strongest induction in arabinose and galactose proteomes. The fourth induced protein of the operon, also expressed in the arabinogalactan proteome, is affiliated with the superfamily of protein- or cofactor-modifying RimK-type glutamate ligases with an ATP-grasp binding domain (JM81_RS00865).
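Fold inductions such as those quoted for PUL 1 are ratios of LFQ intensities between a condition and the glucose reference. The following is a minimal sketch, assuming the ratio is taken between the means of the replicate intensities; the function name and the intensity values are hypothetical, and the averaging choice is an assumption rather than the documented procedure of this study.

```python
import math

def fold_induction(condition_lfq, reference_lfq):
    """Ratio of mean LFQ intensities, or None if the reference mean is zero.

    A zero reference means the protein was not detected in the reference
    condition, so a fold change is undefined.
    """
    condition_mean = sum(condition_lfq) / len(condition_lfq)
    reference_mean = sum(reference_lfq) / len(reference_lfq)
    if reference_mean == 0:
        return None
    return condition_mean / reference_mean

# Hypothetical LFQ intensities from three replicates each.
arabinogalactan = [4.1e8, 3.9e8, 4.0e8]
glucose = [4.2e7, 3.8e7, 4.0e7]

fold = fold_induction(arabinogalactan, glucose)
log10_difference = math.log10(fold)  # distance on a log10 LFQ axis
```

On a log10 axis, as used for the expression plots of the PULs, a 10-fold induction corresponds to a difference of one unit.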
PUL 7 contains a single SusC/D pair and a tandem of SusC/D pairs in one genetic region. It encodes 42 proteins, including 13 CAZymes, three SusC/D pairs, and one sulfatase (Fig. 5). One SusC/D pair and six CAZymes were expressed in arabinogalactan-grown cells. SusC (JM81_RS13730) and SusD (JM81_RS13725) were expressed in the galactose and arabinogalactan proteomes 100-fold and 10-fold stronger than in the arabinose proteome, respectively, and not in the glucose proteome. The tandem SusC/D pairs were not detected in any of the proteomes. An α-l-fucosidase of the GH29 family (JM81_RS13700) had the highest expression among the CAZymes in this PUL. The GH29 was expressed at similar intensities in all four conditions, suggesting a constitutive expression of this periplasmic enzyme. Less intense, but also expressed in all proteomes, was a xylan-α-1,2-glucuronidase belonging to the GH115 family (JM81_RS13820), with the strongest expression on galactose. Two GH105 unsaturated rhamnogalacturonyl hydrolases (EC 3.2.1.172) (JM81_RS13845 and JM81_RS13890) were expressed in all four growth conditions, with the exception of JM81_RS13890, which was not detected in the arabinose proteome. A GH43_18 (JM81_RS13895) was expressed in all four proteomes with similar expression intensities. An α-l-rhamnosidase GH78 (JM81_RS13900) was expressed under all growth conditions. During our analysis, a hypothetical protein (JM81_RS13825) with a six-hairpin GH-like family domain sparked our interest. It was expressed in all four proteomes, with higher intensities in arabinogalactan, arabinose, and galactose proteomes.
PUL 8 encodes a total of 58 proteins, including 11 CAZymes and two SusC/D pairs (Fig. 6). A total of five CAZymes, two SusCs, but only one SusD were expressed in arabinogalactan-grown cells. SusC (JM81_RS16585) and SusD (JM81_RS16590) were expressed in the arabinose and arabinogalactan proteomes. Another SusC (JM81_RS16455) showed expression, slightly lower than the other SusC, in the arabinose proteome and slightly less for arabinogalactan. Two GH105 proteins (JM81_RS16470 and JM81_RS16475) annotated as unsaturated rhamnogalacturonyl hydrolases were expressed similarly in all proteomes. JM81_RS16510 includes two domains, GH43_19 and GH43_34. It was expressed in the arabinose, arabinogalactan, and galactose proteomes, whereby the highest intensities were measured for arabinose. Another α-l-arabinofuranosidase, a GH51 (JM81_RS16515), was expressed in a similar pattern to the GH43_19 + GH43_34 protein. These two genes are followed by genes of the arabinose metabolism to the pentose phosphate pathway (ribulokinase, l-ribulose-5-phosphate 4-epimerase, and l-arabinose isomerase) and a gene for a galactose mutarotase. All proteins in this operon were expressed in the arabinose, arabinogalactan, and galactose proteomes, with highest intensities in arabinose proteomes. The function of a GH109, a member of the Gfo/Idh/MocA superfamily of NAD(P)-dependent oxidoreductases that had the highest expression in the arabinogalactan proteome, is unknown. The expression of a mannonate dehydratase (JM81_RS16615) hinted at a sugar acid metabolism. Interestingly, PUL 8 is preceded by an operon with sugar acid metabolizing enzymes. The following enzymes were induced in the arabinogalactan proteome in comparison to glucose: 5-dehydro-4-deoxy-d-glucuronate isomerase, gluconate-5-dehydrogenase, a sugar kinase, 2-dehydro-3-deoxyphosphogluconate aldolase, and tagaturonate reductase.
An analysis with dbCAN3 identified 153 CAZymes in the genome, of which 106 were detected in the proteomes. Outside of PULs 1, 7, and 8, several CAZymes were expressed during arabinogalactan degradation. Many expressed CAZymes had a signal peptide for export out of the cytosol (Table S1 and Fig. S2, Supporting Information). Three of the CAZymes were annotated as GH family 3 enzymes. JM81_RS00095 was expressed in all four conditions; the highest intensities were measured in the arabinogalactan proteome (Fig. S2A, Supporting Information). The second GH3 (JM81_RS08450) was expressed in all four conditions, but with a three to four times larger expression in arabinogalactan, arabinose, and galactose (Fig. S2B, Supporting Information). A third GH3 (JM81_RS18250) was also expressed in all four conditions, but the highest intensities were measured for arabinose and galactose. It was part of an operon that also includes an endo-1,4-β-xylanase (GH10) expressed only in arabinose- and galactose-grown cells (Fig. S2C, Supporting Information). All three GH3 were annotated as galactosidases. A GH43_26 (JM81_RS08585) was expressed in all four datasets, whereby the highest intensities were recorded for arabinose and nearly identical LFQs for glucose and arabinogalactan (Fig. S2D, Supporting Information). A GH115 xylan-α-1,2-glucuronidase (JM81_RS03245) was only expressed in arabinose- and arabinogalactan-grown cells (Fig. S2E, Supporting Information).
The transport of the monosaccharides across the inner membrane may be facilitated by an ABC transport system consisting of ABC substrate-binding (JM81_RS03610), ABC permease (JM81_RS16840), and ABC ATP binding proteins (JM81_RS01625).
Marine glycans are often decorated with sulfate. We identified 13 sulfatases in the genome of MAR_2009_72, of which three were expressed in arabinogalactan-grown cells. JM81_RS05685, JM81_RS05692, and JM81_RS076760 were equally expressed in all four proteomes. All three were previously affiliated with the utilization of mucin, which contains to some extent galactose (Tailford et al. 2015, Glover et al. 2022).

Discussion
Galactose is one of the four abundant monosaccharides in planktonic organic matter, mainly as part of polysaccharides and more complex molecules, i.e. arabinogalactan proteins. Plating sea and sediment pore water on arabinogalactan medium showed a large microbiome with the capacity to utilize arabinogalactan for growth. Together with the recent finding that particle-associated bacteria dominate the readily culturable fraction of seawater microbiomes (Heins and Harder 2023), this observation indicates that arabinogalactan is a common carbon source for particle-associated bacteria.
Arabinogalactan degradation pathways have so far only been described for bacteria from gut and plant systems, but not for marine bacteria (Shulami et al. 2011, Ndeh et al. 2017, Cartmell et al. 2018, Luis et al. 2018, Fujita et al. 2019, Wang and LaPointe 2020, Sasaki et al. 2021). These studies provided information regarding enzymes involved in arabinogalactan utilization, which include the GH families GH43, GH51, GH27, and GH28, often organized in PULs (Shulami et al. 2011, Cartmell et al. 2018, Luis et al. 2018). Hence, we first inspected the upregulated proteins in arabinogalactan-grown cells in comparison to glucose-grown cells. After a discussion of the SusC/D systems, we analyzed the uniqueness of marine PULs for arabinogalactan degradation in Maribacter sp. MAR_2009_72.
The transport of the oligosaccharides involved several SusC/D pairs. PULs 1, 7, and 8 encode the three SusC/D systems that had the highest expression intensities of all SusC/Ds in the arabinogalactan proteome. On the basis of the dedicated substrate specificity of SusC/D transport systems, we propose two explanations for the induction of several SusC/D pairs: (i) the extracellular hydrolysis of larch wood arabinogalactan generates a mixture of structurally different oligosaccharides, which need dedicated transport systems, and (ii) a signal molecule derived from larch wood arabinogalactan may induce the expression of proteins that may not be necessary for larch wood arabinogalactan, but for the degradation of marine arabinogalactans. The structural diversity of arabinogalactans in terrestrial systems is well characterized (Fujita et al. 2019, Villa-Rivera et al. 2021, Leszczuk et al. 2023), but marine arabinogalactans are understudied.
In the periplasm, the oligosaccharides are further hydrolyzed by a range of CAZymes. Some PULs (1 and 7) expressed enzymes that can generate monomers. Furthermore, the proteomes contained CAZymes that are not encoded in PULs and are predicted to be periplasmatic. The GH10 of PUL 1 was annotated as an endo-1,4-β-xylanase, which indicates that arabinoxylans may also be a substrate for the enzymes of PUL 1. The expression of an α-glucuronidase annotated to GH67 coincides with the presence of glucuronic acid in side chains of arabinogalactan. GH67 removes glucuronic acid from side chains by a single-displacement, inverting mechanism (Shulami et al. 1999, Biely et al. 2000, Nagy et al. 2002). However, it only removes glucuronic acid from nonreducing ends of the oligo- and polysaccharides. A broader substrate range is known for GH115 proteins, which remove glucuronic acid from terminal and internal regions of oligosaccharides (Ryabova et al. 2009, Aalbers et al. 2015). The presence of both GH families, GH67 and two GH115, suggests that glucuronic acid is part of the decoration of arabinogalactans. The expression of the GH29 argues for fucose as a decorating sugar. Enzymes of the family GH29 are exo-α-fucosidases and cleave via a retaining mechanism (Grootaert et al. 2020). Rhamnose as a specific substrate is likewise supported by the expression of a GH78, an α-l-rhamnosidase. This GH family solely includes rhamnosidases, which use an inverting mechanism to hydrolyze bonds in cooperation with their catalytic residues (Cui et al. 2007). The galactan backbone hydrolysis requires a β-d-galactosidase. This enzymatic function is frequent among members of the GH family GH3. The proteomes contained three expressed GH3 proteins. Final steps of the arabinogalactan pathway include the translocation through the inner membrane, likely via an ABC transport system, and cytoplasmic transformations to channel galactose, arabinose, glucuronic acid, rhamnose, and fucose into the pentose phosphate pathway and glycolysis.
We investigated the distribution of PULs 1, 7, and 8 of Maribacter sp. MAR_2009_72 in the PULDB database using the expressed CAZymes (Terrapon et al. 2018). Homologs of PUL 1 have been characterized for human gut bacteria and Bacteroides spp. for the utilization of a range of xylan polysaccharides including arabinoxylan (Martens et al. 2008, Rogowski et al. 2015, Wang et al. 2016). The PUL was detected in silico in genomes of a large variety of Bacteroidota. In contrast, PUL 7 has so far not been studied experimentally. An in silico search detected a homologous PUL structure in Maribacter sedimenticola DSM19840 (Nedashkovskaya et al. 2004). PUL 8 also has a homolog in M. sedimenticola DSM19840 and other Bacteroidota.
A recent metagenomic study of particle-associated bacteria detected a GH43-rich PUL in a Maribacter MAG, which the authors annotated as an arabinogalactan PUL (Wang et al. 2024). This PUL is different from the PULs we identified for arabinogalactan in the genome of Maribacter sp. MAR_2009_72.
Our observations revealed a substrate specificity of the three PULs. In PUL 1, arabinogalactan is the only inducer for SusC/D, and the expression of a glucuronidase and a xylanase suggests that glycans with these sugars are also substrates for the PUL (Fig. S3, Supporting Information). This hypothesis is supported by previous studies with gut bacteria (Martens et al. 2008, Rogowski et al. 2015, Wang et al. 2016). PULs 7 and 8 have so far not been experimentally observed. PUL 7 is characterized by a very strong induction of SusC/D by galactose and arabinogalactan (Fig. S4, Supporting Information). Galactose is the strongest inducer for several proteins, suggesting galactans as substrate. The presence of fucosidase, glucuronidase, and rhamnosidase suggests a decoration of the marine galactans with the corresponding monosaccharides. PUL 8 is dedicated to arabinose-containing glycans. The SusC/D is induced by arabinose and arabinogalactan (Fig. S5, Supporting Information). Besides GHs, the genetic region of PUL 8 also includes monosaccharide-transforming cytoplasmatic enzymes for arabinose and sugar acids. This PUL shows that the consideration of cytosolic carbohydrate-transforming enzymes in the bioinformatic analysis of PULs may improve predictions of substrate specificity.
The comparative proteomic analysis of larch wood arabinogalactan degradation by Maribacter sp. MAR_2009_72 identified expressed proteins encoded in three PULs and outside of PULs (Fig. 3). In summary, members of the GH families 43, 51, and 105 may produce a variety of oligosaccharides. At least three SusC/D systems are involved in the transport into the periplasm, where enzymes belonging to the GH families 3, 10, 29, 67, 78, and 115 produce monosaccharides. The interplay of all these enzymes allows for the utilization of arabinogalactan, which we have summarized in a graph (Fig. 7). The plant polysaccharide structure is expected to be less complex than the variety of arabinogalactans present in the marine habitat (Pfeifer et al. 2020). This may explain why not all CAZymes of each PUL were detected as expressed proteins. A difference between this study of a marine bacterium and previous studies on gut and plant-associated bacteria was the presence of GH105 enzymes and the absence of GH27 and GH28 enzymes. Future studies might characterize marine arabinogalactans, and enzymatic studies will resolve the individual functions of the induced proteins to provide further information on the microbial utilization.

Figure 2. Comparison of the number of detected proteins in arabinogalactan, arabinose, galactose, and glucose. (A) Venn diagram showing the overlap of detected proteins in at least one of three biological replicates. (B) Principal component analysis shows the differences between the expression intensities of the four proteomes of MAR_2009_72.

Figure 3. Full genome overview of Maribacter sp. MAR_2009_72 showcasing the GC content (ring one, the innermost ring), all annotated coding genes (CDS, rings two and three) in forward and reverse direction, CAZymes identified by dbCAN3 (ring four), SusC/D (ring five), sulfatases (ring six), and PULs (ring seven). Furthermore, we highlighted CAZymes and SusC/Ds that might be important for arabinogalactan utilization.

Figure 4. Gene organization and expression of polysaccharide utilization locus 1 of Maribacter sp. MAR_2009_72 grown in the presence of arabinogalactan, arabinose, galactose, and glucose. Expression intensities in the plot are the mean values of three biological replicates of each condition shown in LFQ values [log10].

Figure 5. Gene organization and expression of polysaccharide utilization locus 7 of Maribacter sp. MAR_2009_72 grown in the presence of arabinogalactan, arabinose, galactose, and glucose. Expression intensities in the plot are the mean values of three biological replicates of each condition shown in LFQ values [log10].

Figure 6. Gene organization and expression of polysaccharide utilization locus 8 of Maribacter sp. MAR_2009_72 grown in the presence of arabinogalactan, arabinose, galactose, and glucose. Expression intensities in the plot are the mean values of three biological replicates of each condition shown in LFQ values [log10].